 foundation model


AIhub monthly digest: May 2026 – AI for science, the lottery ticket hypothesis, and world models

AIHub

Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we learn about AI for science, delve into world models, research transparent and trustworthy AI, and hear about the lottery ticket hypothesis. The latest interview in our series with the AAAI/SIGAI Doctoral Consortium participants featured Ximing Wen, who is researching transparent and trustworthy AI systems. We found out more about her work, her experience as a research intern, and what inspired her to study AI. In this wide-ranging conversation, Jonathan Frankle delves into empiricism versus theoretical proofs, how the approach to computer science has changed (even if the fundamental problems haven't), how younger researchers are rapidly adapting to a world that values impact above all else, and what it means to be a researcher.


Assessing the Operational Viability of Foundation Models for Time Series Forecasting

arXiv.org Machine Learning

Time series forecasting drives operational decisions in areas like finance, transportation, and energy. While supervised learning approaches achieve strong performance, they require domain-specific training, feature engineering, and ongoing maintenance. Large-scale foundation models have recently emerged as a zero-shot alternative, avoiding task-specific training much like LLMs. In this work, we evaluate foundation models against standard supervised approaches. Rather than focusing solely on aggregate accuracy, we analyze performance across four operational regimes: periodic human-centric systems, physically constrained processes, stochastic financial markets, and heterogeneous demand forecasting. Our results characterize optimal deployment areas. Foundation models perform well in domains with transferable periodic structures and are efficient for cold-start or long-tail scenarios. Conversely, supervised specialists maintain higher precision in systems governed by strict physical constraints. In financial domains, newer foundation models are rapidly closing the performance gap with supervised specialists. We further quantify trade-offs in inference latency, data drift adaptability, and deployment constraints. Finally, we propose a Complexity Router that assigns each series to the optimal model class using empirical features. We demonstrate that this selective routing achieves higher accuracy and significantly lower inference costs compared to deploying a universal foundation model, providing a practical framework for balancing generalization and efficiency.
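The routing idea can be illustrated with a minimal sketch. The abstract does not say which empirical features or decision rule the Complexity Router uses, so everything below is an assumption for illustration only: the feature (lag autocorrelation at a known season length), the cold-start cutoff `min_history`, the threshold `periodicity_threshold`, and the function names are all hypothetical. The rule simply mirrors the abstract's qualitative findings: short histories and strongly periodic series go to a foundation model, the rest to a supervised specialist.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag`; 0.0 when undefined."""
    n = len(series)
    if lag <= 0 or lag >= n:
        return 0.0
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series carries no periodic signal
    num = sum((series[t] - mu) * (series[t - lag] - mu) for t in range(lag, n))
    return num / denom


def route_series(series, season_length, min_history=100, periodicity_threshold=0.5):
    """Hypothetical router: returns "foundation" or "supervised".

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    # Cold-start case: too little history to train a specialist.
    if len(series) < min_history:
        return "foundation"
    # Strong seasonal autocorrelation suggests transferable periodic structure.
    if autocorrelation(series, season_length) >= periodicity_threshold:
        return "foundation"
    return "supervised"
```

In practice the thresholds would need to be tuned on held-out series, and a real router would likely combine several features rather than a single autocorrelation.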


Isotonic Survival Regression: Calibrated Survival Distributions from Deep Cox Models

arXiv.org Machine Learning

Time-to-event data is widespread across the life sciences and engineering, but it is typically encountered together with censoring, which complicates the application of standard machine learning methods. Deep Cox models have emerged as a popular method for analyzing time-to-event data because they gracefully handle censoring and can be used with unstructured data such as clinical text reports, genomic sequences, and pathology images. However, their predicted survival probabilities are often poorly calibrated, thus limiting their practical utility. In this paper, we propose a novel post hoc calibration method for Deep Cox models that uses isotonic regression to refine predicted survival probabilities without affecting discriminative power. We establish favorable theoretical guarantees, including a double-robustness property and asymptotic calibration. Experiments on synthetic and real-world clinical data demonstrate the empirical effectiveness of our method.
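To make the role of isotonic regression concrete, here is a simplified sketch of post hoc calibration at a single time horizon. It is not the paper's method: it assumes every subject is observed through the horizon (no censoring), so each outcome is simply 1 if the event occurred by then and 0 otherwise, and it uses the standard pool-adjacent-violators algorithm. The function names `fit_isotonic` and `calibrate` are hypothetical. Because the fitted map is non-decreasing, it never reverses the order of two predictions, although it can tie them.

```python
from bisect import bisect_left


def fit_isotonic(scores, outcomes):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    scores: predicted event probabilities at a fixed horizon.
    outcomes: 1 if the event occurred by the horizon, else 0 (no censoring assumed).
    Returns (thresholds, values): block upper bounds and calibrated rates.
    """
    blocks = []  # each block: [sum of outcomes, count, largest score in block]
    for score, outcome in sorted(zip(scores, outcomes)):
        blocks.append([float(outcome), 1, score])
        # Merge backwards while scores tie or block means decrease.
        while len(blocks) > 1 and (
            blocks[-2][2] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values


def calibrate(score, thresholds, values):
    """Map a raw score to the value of the first block whose bound covers it."""
    idx = bisect_left(thresholds, score)
    return values[min(idx, len(values) - 1)]
```

For example, fitting on scores `[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]` with outcomes `[0, 1, 0, 1, 1, 1]` pools the second and third points into one block with rate 0.5, so a new score of 0.25 is calibrated to 0.5.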


ISOMORPH: A Supply Chain Digital Twin for Simulation, Dataset Generation, and Forecasting Benchmarks

arXiv.org Machine Learning

Open time-series forecasting (TSF) benchmarks cover retail, energy, weather, and traffic, but supply-chain logistics remains underserved. We introduce ISOMORPH, the first public digital twin of a multi-echelon logistics network with fully interpretable, user-configurable parameters and modular topology, demand process, and control rules. The simulator advances a directed routing graph in discrete time: demand arrives at the destination, is served from stock or recorded as backlog, and triggers replenishment through the network. The state vector tracks per-node on-hand inventory with outstanding orders, in-transit shipments, and a smoothed demand estimate, so the dynamics close as a Markov chain on a tractable state space whose transition kernel acts linearly on the empirical distribution of the state. The released data reproduces the bullwhip effect at empirically consistent magnitudes, and three conservation laws encoded in the Markov chain serve as verification tools when users extend the simulator. We release datasets at two catalogue scales ($C=50$ and $C=200$) with six scenario sweeps producing 30 additional rollouts and 20 Latin-hypercube perturbations, exhibiting dynamics absent from fixed TSF benchmarks: variance amplification, cascading bottlenecks, regime shifts, and cross-channel coupling through shared macro shocks. Zero-shot evaluation of four foundation models (Chronos, Moirai, TimesFM, Lag-Llama) shows MASE values exceeding public GIFT-Eval references at low-to-moderate horizons, supporting incorporation into existing benchmarks. The same pairing produces forecast confidence bands via Latin-hypercube perturbation of demand-side knobs, forward UQ from parameter uncertainty unavailable on standard TSF datasets, demonstrating that foundation models can serve as fast surrogates for the digital twin's forward UQ. Code (MIT): https://github.com/tuhinsahai/ISOMORPH.
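For readers unfamiliar with the metric, MASE (mean absolute scaled error) divides the forecast's mean absolute error by the in-sample mean absolute error of a naive forecast that repeats the value from one season earlier. The sketch below follows that standard definition; the function name, argument names, and the default `season_length=1` are illustrative choices, and the evaluation protocol used by ISOMORPH or GIFT-Eval may differ in details such as aggregation across series.

```python
def mase(history, actuals, forecasts, season_length=1):
    """Mean absolute scaled error of `forecasts` against `actuals`.

    history: in-sample observations used to scale the error.
    season_length: lag of the naive reference forecast (1 = previous value).
    """
    if not actuals or len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be non-empty and equal length")
    if len(history) <= season_length:
        raise ValueError("history must be longer than season_length")
    # In-sample error of the seasonal naive forecast.
    naive_errors = [
        abs(history[t] - history[t - season_length])
        for t in range(season_length, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    mae = sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
    return mae / scale
```

A value below 1 means the forecast beats the in-sample naive baseline on average, and a value above 1 means it does worse. For instance, `mase([10, 12, 14, 16], [18, 20], [17, 22])` has a naive scale of 2 and a forecast MAE of 1.5, giving 0.75.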


Reports of the Workshops Held at the 2026 AAAI Conference on Artificial Intelligence

Interactive AI Magazine

The 10th International Workshop on Health Intelligence (W3PHIAI-26) celebrated a decade of bringing AI and health research together, building on a lineage that began with the AAAI-W3PHI workshops focused on population health (2014-2016), the AAAI-HIAI workshops focused on personalized health (2013-2016), and the subsequent joint W3PHIAI workshops held annually from 2017 through 2025. Over this decade, the series has produced hundreds of talks and high-impact publications that have collectively received thousands of citations, shaping the research agenda in both population health intelligence and personalized healthcare AI. This year's special theme, "Foundation Models and AI Agents," reflected the field's rapidly evolving frontier: the emergence of autonomous and semi-autonomous AI systems reshaping clinical workflows, patient management, health system operations, and public health surveillance. Day 1 of the workshop focused on medical imaging and the translation of AI for clinical ...


Amortizing Causal Sensitivity Analysis via Prior Data-Fitted Networks

arXiv.org Machine Learning

Causal sensitivity analysis aims to provide bounds for causal effect estimates in the presence of unobserved confounding. However, existing methods for causal sensitivity analysis are per-instance procedures, meaning that changes to the dataset, causal query, sensitivity level, or treatment require new computation. Here, we instead present an in-context learning approach. Specifically, we propose an amortized approach to causal sensitivity analysis based on prior-data fitted networks. A key challenge is that the sensitivity bounds are not directly available when sampling training data. To address this, we develop a general prior-data construction that is applicable across the class of generalized treatment sensitivity models. Our construction involves a Lagrangian scalarization of the objective to generate training labels for the bounds through a tradeoff between causal effect minimization or maximization and sensitivity model violation, which avoids model-specific analytical derivations. We further show that, under standard convexity and linearity conditions, our objective recovers the full Pareto frontier of solutions. Empirically, we demonstrate our amortized approach across various datasets, causal queries, and sensitivity levels, where our approach achieves a test-time computation that is orders of magnitude faster than per-instance methods. To the best of our knowledge, ours is the first foundation model for in-context learning for causal sensitivity analysis.


TabCF: Distributional Control Function Estimation with Tabular Foundation Models

arXiv.org Machine Learning

Instrumental variable (IV) and control function (CF) methods are powerful tools for causal effect estimation in the presence of unmeasured confounding, yet most existing approaches target only mean effects and/or demand substantial fitting and tuning effort. In this paper, we introduce a simple method, TabCF, for control function regression using tabular foundation models, which enables accurate, fast, identification-transparent, and tuning-light causal estimation of distributional quantities, such as interventional means and quantiles; we also propose a copula-based approximation for multivariate outcomes. TabCF performs favorably against representative methods across a broad range of small- to medium-sized synthetic and real data scenarios. The central message is two-fold: for practitioners, it highlights that TabCF is an effective tool for distributional causal inference; for researchers, it suggests that the proposed approach could be considered a strong baseline for future method development. Code is available at https://github.com/GepingChen/TabCF.


Report on foundation model impacts released

AIHub

Partnership on AI has published a progress report on post-deployment governance practices pertaining to foundation models. The document, entitled "2026 Transparency Report on Foundation Model Impacts", measures the progress of 13 foundation model providers in publicly documenting the impacts of their foundation models. In carrying out their analysis, authors Jacob Pratt and Albert Tanjaya reviewed more than 150 papers, articles, websites, and reports. For assessment, these four practices were broken down into 19 processes, or activities, that support how foundation model providers adopt practices. Although several leading organizations are defining what information to share and how, the rest are slow in adopting information-sharing practices.


AI for Science – from cosmology to chemistry

AIHub

On 31 March, our editorial team headed to the Royal Society for AI for Science. This day-long conference explored how AI is changing the nature of scientific discovery, and was hosted by the Fundamental Research team from the Alan Turing Institute. Nestled in a terrace of 19th-century townhouses along the banks of the Thames, the Royal Society looks as grand as the names who have passed through its doors throughout the years. Prof Jason McEwen, Chief Scientist for the Turing Institute, opened the event with an insightful talk on the nature of scientific revolution, and how the bidirectional relationship between AI and science could spark the next one. Then, Prof Anna Scaife from the University of Manchester spoke on the use of foundation models for astronomical discovery.